An even faster algorithm for ridge regression of reduced rank data

Author

  • Berwin A. Turlach

Abstract

Hawkins and Yin (Comput. Statist. Data Anal. 40 (2002) 253) describe an algorithm for ridge regression of reduced rank data, i.e. data where p, the number of variables, is larger than n, the number of observations. Whereas a direct implementation of ridge regression in this setting requires calculations of order O(np² + p³), their algorithm uses only calculations of order O(np²). In this paper, we describe an alternative algorithm based on a factorization of the (transposed) design matrix. This approach is numerically more stable, further reduces the amount of calculations and needs less memory. In particular, we show that the factorization can be calculated in O(n²p) operations. Once the factorization is obtained, for any value of the ridge parameter the ridge regression estimator can be calculated in O(np) operations and the generalized cross-validation score in O(n) operations. © 2004 Elsevier B.V. All rights reserved.
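The abstract does not reproduce the paper's factorization, but a minimal sketch of the general idea, assuming Python with NumPy, is given below: a thin SVD of the n × p design matrix X (one route that achieves the stated operation counts, not necessarily the paper's own factorization) is computed once in O(n²p), after which the ridge estimator costs O(np) per ridge parameter and the GCV score O(n). All function names and the particular GCV formula used are illustrative.

```python
import numpy as np

def ridge_reduced_rank(X, y):
    """Precompute a factorization of X (n < p) so that, for any ridge
    parameter lam, the estimator and GCV score are cheap to evaluate.
    A sketch of the general idea only, via the thin SVD of X."""
    n, p = X.shape
    # Thin SVD of the n x p design matrix: O(n^2 p) work when n < p.
    # U is n x n, s has length n, Vt is n x p.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y  # rotated response, computed once in O(n^2)

    def estimate(lam):
        # Ridge solution beta = V diag(s/(s^2+lam)) U^T y: O(np) per lam.
        d = s / (s**2 + lam)
        return Vt.T @ (d * Uty)

    def gcv(lam):
        # GCV score from the n singular values only: O(n) per lam.
        h = s**2 / (s**2 + lam)   # eigenvalues of the hat matrix
        df = h.sum()              # effective degrees of freedom
        resid = (1 - h) * Uty     # residuals in the rotated basis
        rss = resid @ resid
        return n * rss / (n - df)**2

    return estimate, gcv

# Hypothetical usage: choose lam on a grid by GCV, then fit.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 500))  # n = 20 observations, p = 500 variables
y = rng.standard_normal(20)
estimate, gcv = ridge_reduced_rank(X, y)
lams = np.logspace(-3, 3, 25)
beta = estimate(min(lams, key=gcv))
```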


Similar articles

Reduced rank ridge regression and its kernel extensions

In multivariate linear regression, it is often assumed that the response matrix is intrinsically of lower rank. This could be because of the correlation structure among the predictor variables or because the coefficient matrix is itself of low rank. To accommodate both, we propose a reduced rank ridge regression for multivariate linear regression. Specifically, we combine the ridge penalty with the reduced...
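A minimal sketch of the two-step idea this combination suggests, assuming Python with NumPy: fit multivariate ridge, then project the coefficient matrix onto the top-r right singular subspace of the fitted values. This is an illustrative heuristic, not necessarily the authors' exact estimator.

```python
import numpy as np

def reduced_rank_ridge(X, Y, lam, r):
    """Rank-r ridge coefficients for multivariate regression (sketch)."""
    n, p = X.shape
    # Ridge step: standard multivariate ridge coefficients, p x q.
    B_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)
    # Rank-reduction step: project the fitted values onto their top-r
    # right singular subspace and apply that projection to B.
    U, s, Vt = np.linalg.svd(X @ B_ridge, full_matrices=False)
    P = Vt[:r].T @ Vt[:r]  # q x q projection onto the top-r subspace
    return B_ridge @ P     # coefficient matrix of rank at most r
```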


Topics on Reduced Rank Methods for Multivariate Regression

Topics in Reduced Rank Methods for Multivariate Regression, by Ashin Mukherjee. Advisors: Professor Ji Zhu and Professor Naisyin Wang. Multivariate regression problems are a simple generalization of the univariate regression problem to the situation where we want to predict q (> 1) responses that depend on the same set of features or predictors. Problems of this type are encountered commonly in many...


Sharper Bounds for Regression and Low-Rank Approximation with Regularization

We study matrix sketching methods for regularized variants of linear regression, low rank approximation, and canonical correlation analysis. Our main focus is on sketching techniques which preserve the objective function value for regularized problems, which is an area that has remained largely unexplored. We study regularization both in a fairly broad setting, and in the specific context of th...
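As an illustration of the sketch-and-solve template for a regularized problem, here is a minimal Gaussian-sketch ridge solver in Python with NumPy. The Gaussian choice and the sketch dimension m are assumptions for the example; the papers listed here study sharper, more structured sketches.

```python
import numpy as np

def sketched_ridge(X, y, lam, m, rng):
    """Solve ridge regression on a row-compressed problem (sketch).
    S is an m x n Gaussian sketching matrix; one of many possible
    sketches, used here only to illustrate the template."""
    n, p = X.shape
    S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian sketch
    SX, Sy = S @ X, S @ y                         # compress n rows to m
    # Solve min ||SX b - Sy||^2 + lam ||b||^2 on the smaller problem.
    return np.linalg.solve(SX.T @ SX + lam * np.eye(p), SX.T @ Sy)
```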


Sharper Bounds for Regularized Data Fitting

We study matrix sketching methods for regularized variants of linear regression, low rank approximation, and canonical correlation analysis. Our main focus is on sketching techniques which preserve the objective function value for regularized problems, which is an area that has remained largely unexplored. We study regularization both in a fairly broad setting, and in the specific context of th...


Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling

Ridge leverage scores provide a balance between low-rank approximation and regularization, and are ubiquitous in randomized linear algebra and machine learning. Deterministic algorithms are also of interest in the moderately big data regime, because they provide interpretability to the practitioner: they have no failure probability and always return the same results. We pr...
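A minimal sketch of ridge leverage scores, assuming Python with NumPy, with a simplified deterministic top-k selection rule for illustration (the paper's actual algorithm and guarantees may use a different, e.g. threshold-based, rule).

```python
import numpy as np

def ridge_leverage_scores(X, lam):
    """Ridge leverage score of row x_i: x_i^T (X^T X + lam I)^{-1} x_i."""
    n, p = X.shape
    G = X.T @ X + lam * np.eye(p)
    # tau_i is the i-th diagonal entry of X G^{-1} X^T; solve once.
    return np.einsum('ij,ij->i', X, np.linalg.solve(G, X.T).T)

def deterministic_sample(X, lam, k):
    # Keep the k rows with the largest ridge leverage scores:
    # a deterministic rule with no failure probability.
    tau = ridge_leverage_scores(X, lam)
    return np.argsort(tau)[::-1][:k]
```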




Journal:
  • Computational Statistics & Data Analysis

Volume 50, Issue –

Pages –

Published 2006